Variable-Length Word Encodings for Neural Translation Models

Authors

  • Rohan Chitnis
  • John DeNero
Abstract

Recent work in neural machine translation has shown promising performance, but the most effective architectures do not scale naturally to large vocabulary sizes. We propose and compare three variable-length encoding schemes that represent a large vocabulary corpus using a much smaller vocabulary with no loss in information. Common words are unaffected by our encoding, but rare words are encoded using a sequence of two pseudo-words. Our method is simple and effective: it requires no complete dictionaries, learning procedures, increased training time, changes to the model, or new parameters. Compared to a baseline that replaces all rare words with an unknown word symbol, our best variable-length encoding strategy improves WMT English-French translation performance by up to 1.7 BLEU.
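To make the encoding contract concrete, here is a minimal Python sketch of one such lossless scheme. It is not the paper's exact construction: the rank-based pairing, the base `K`, and the `<P1_i>`/`<P2_j>` pseudo-word names are illustrative assumptions. Common words pass through unchanged, and each rare word maps reversibly to a pair of pseudo-words drawn from a vocabulary of only 2·K symbols.

```python
# Hypothetical sketch of a two-pseudo-word encoding (not the paper's exact scheme).
# Common words pass through unchanged; each rare word's frequency rank is written
# in base K, giving a reversible two-symbol code over a small pseudo-vocabulary.

K = 1000  # pseudo-vocabulary base; supports up to K**2 = 1,000,000 rare words

def build_codebook(vocab_by_freq, num_common):
    """Map each rare word to a pair of pseudo-words; common words map to themselves."""
    encode, decode = {}, {}
    for rank, word in enumerate(vocab_by_freq[num_common:]):
        hi, lo = divmod(rank, K)
        pair = (f"<P1_{hi}>", f"<P2_{lo}>")   # two pseudo-word symbols
        encode[word] = pair
        decode[pair] = word
    return encode, decode

def encode_sentence(tokens, encode):
    out = []
    for tok in tokens:
        out.extend(encode.get(tok, (tok,)))   # rare -> two symbols, common -> itself
    return out

def decode_sentence(tokens, decode):
    out, i = [], 0
    while i < len(tokens):
        if tokens[i].startswith("<P1_"):      # first half of a rare-word pair
            out.append(decode[(tokens[i], tokens[i + 1])])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out
```

With K = 1000, a pseudo-vocabulary of 2,000 symbols losslessly covers a million rare words, which is the sense in which a large-vocabulary corpus can be represented with a much smaller vocabulary and no loss of information.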

Similar articles

Characterization Results for Time-Varying Codes

Time-varying codes associate variable-length codewords to the letters being encoded, depending on their positions in the input string. These codes were introduced in [8] as a proper extension of L-codes. This paper is devoted to a further study of time-varying codes. First, we show that adaptive Huffman encodings are special cases of encodings by time-varying codes. Then, we focus on three kin...
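As a toy illustration of the definition (the coding tables below are hypothetical, not taken from [8]), a time-varying code can alternate between per-position tables, so the same letter receives different codewords at different positions; each table here is prefix-free, so a decoder that tracks the position can still decode uniquely.

```python
# Illustrative sketch of a time-varying code: the codeword assigned to a
# letter depends on that letter's position in the input string.

TABLES = [
    {"a": "0",  "b": "10", "c": "11"},   # table used at even positions
    {"a": "11", "b": "0",  "c": "10"},   # table used at odd positions
]

def tv_encode(s):
    # Position i selects the coding table, so the same letter can receive
    # different codewords at different positions.
    return "".join(TABLES[i % len(TABLES)][ch] for i, ch in enumerate(s))

print(tv_encode("abca"))  # "0" + "0" + "11" + "11" -> "001111"
```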

Special Cases of Encodings by Generalized Adaptive Codes

Adaptive (variable-length) codes associate variable-length codewords to the symbols being encoded, depending on the previous symbols in the input data string. This class of codes was presented in [10, 11] as a new class of non-standard variable-length codes. Generalized adaptive codes (GA codes, for short) have also been presented in [10, 11] not only as a new class of non-standard variable-leng...
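The contrast with time-varying codes is that the coding table is selected by the preceding symbol rather than by the position. A hedged sketch (these context tables are illustrative, not the GA-code construction of [10, 11]):

```python
# Illustrative sketch of an adaptive (variable-length) code: the codeword
# assigned to a symbol depends on the previously encoded symbol.

CONTEXT_TABLES = {
    None: {"a": "0",  "b": "10", "c": "11"},  # table at the start of the string
    "a":  {"a": "10", "b": "0",  "c": "11"},  # table after an 'a'
    "b":  {"a": "0",  "b": "11", "c": "10"},
    "c":  {"a": "11", "b": "10", "c": "0"},
}

def adaptive_encode(s):
    prev, bits = None, []
    for ch in s:
        bits.append(CONTEXT_TABLES[prev][ch])
        prev = ch   # the next codeword is chosen by this symbol
    return "".join(bits)

print(adaptive_encode("aba"))  # "0" + "0" + "0" -> "000"
```

Each table is prefix-free, and the decoder can maintain the same `prev` state, so decoding remains unambiguous.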

genCNN: A Convolutional Architecture for Word Sequence Prediction

We propose a convolutional neural network, named genCNN, for word sequence prediction. Unlike previous work on neural network-based language modeling and generation (e.g., RNN or LSTM), we choose not to greedily summarize the history of words as a fixed-length vector. Instead, we use a convolutional neural network to predict the next word from a history of words of variable length. Als...
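The architectural idea the abstract describes can be sketched in a few lines of PyTorch. This is not the actual genCNN model; the layer sizes, the single convolution, and the max-pooling step are placeholder assumptions that only illustrate convolving over a variable-length history instead of compressing it into a fixed recurrent state.

```python
# Minimal sketch (PyTorch) of a convolutional next-word predictor; all
# hyperparameters are illustrative, and this is not the genCNN architecture.
import torch
import torch.nn as nn

class ConvNextWord(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, channels=256, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, channels, kernel_size=kernel)
        self.out = nn.Linear(channels, vocab_size)

    def forward(self, history):               # history: (batch, seq_len) word ids
        x = self.emb(history).transpose(1, 2) # -> (batch, emb_dim, seq_len)
        h = torch.relu(self.conv(x))          # convolve over the word history
        h = h.max(dim=2).values               # pool over time: handles variable length
        return self.out(h)                    # scores for the next word

logits = ConvNextWord()(torch.randint(0, 10000, (2, 7)))  # batch of 2 histories
```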

Convolutional Encoders for Neural Machine Translation

We propose a general Convolutional Neural Network (CNN) encoder model for machine translation that fits within the framework of Encoder-Decoder models proposed by Cho et al. [1]. A CNN takes as input a sentence in the source language, performs multiple convolution and pooling operations, and uses a fully connected layer to produce a fixed-length encoding of the sentence as input to a Recur...
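Again as a hedged sketch rather than the paper's model: the shape of such a pairing, with a convolutional encoder producing a fixed-length vector that initializes a recurrent decoder. All dimensions and the GRU choice are illustrative assumptions.

```python
# Sketch of a CNN sentence encoder feeding a recurrent decoder, in the
# Encoder-Decoder framing; dimensions are illustrative, not the paper's model.
import torch
import torch.nn as nn

class CNNEncoder(nn.Module):
    def __init__(self, vocab=8000, emb=128, hid=256, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, hid, kernel_size=kernel, padding=1)
        self.fc = nn.Linear(hid, hid)

    def forward(self, src):                    # src: (batch, src_len) word ids
        x = self.emb(src).transpose(1, 2)      # (batch, emb, src_len)
        h = torch.relu(self.conv(x))           # convolution over the sentence
        pooled = h.max(dim=2).values           # pooling -> fixed-length vector
        return torch.tanh(self.fc(pooled))     # (batch, hid) sentence encoding

enc = CNNEncoder()(torch.randint(0, 8000, (4, 12)))
decoder = nn.GRU(input_size=128, hidden_size=256, batch_first=True)
out, _ = decoder(torch.zeros(4, 5, 128), enc.unsqueeze(0))  # encoding as initial state
```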


Publication date: 2015